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The aberrations of an optical system can be described in terms of the wave aberrations, defined as the 
departure from the ideal spherical wavefront; or the ray aberrations, which are in turn the deviations 
from the paraxial ray intersections measured in the image plane. The classical connection between the 
two descriptions is an approximation, the error of which has, so far, not been quantified analytically. 

We derive exact analytical equations for computing the wavefront surface, the aberrated ray directions, 
and the transverse ray aberrations in terms of the wave aberrations (OPD) and the reference sphere. We 
introduce precise conditions for a function to be an OPD function, show that every such function has an 
associated wavefront, and study the error arising from the classical approximation. We establish strict 
conditions for the error to be small. We illustrate our results with numerical simulations. Our results 
show that large numerical apertures and OPD functions with strong gradients yield larger approximation 
errors. 
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1. INTRODUCTION 


The geometrical theory of aberrations adjusts the predictions of 
paraxial optics to a more realistic depiction of how a real lens 
performs. Two of its descriptors are the ray aberrations and the 
wave aberrations. Both concepts are directly related, as has been 
shown in the classic literature [1-4]. The ray aberrations describe 
the deviation between the aberrated rays and the paraxial/ideal 
rays as a transverse distance measured in the image plane. The 
wave aberration describes the deviation of the aberrated wave- 
front as compared to an ideal spherical wavefront that produces 
a perfect image point. Equivalently, one can consider the wave 
aberrations as the differences in time of flight of the light along 
an aberrated ray with respect to the time it would take to reach 
the image along a paraxial ray, hence the alternative use of the 
name optical path differences (OPD). 


Both concepts, the ray aberrations and wave aberrations are 
commonly related by means of [1, 5] 


aw 

dx 

aw 

dy 



( 1 ) 


where W represents the wave aberration (OPD), Cj and Cy 
are the ray aberrations, and r is the radius of the ideal wavefront, 
also known as the reference sphere. 

The approximation in Eqs. (1) is commonly held to be valid 
for small numerical apertures and small aberrations. In spite 
of this, they are widely used for instance in the analysis of the 
Hartmann-Shack sensor [6, 7] or in optical design software. 

While exact relations between quantities indirectly related to 
the aberrations [8], or based on re-definitions of the optical path 
difference [9], have appeared in the literature, there is, so far, 
no such relation for the standard definition of the optical path 
difference as a phase difference along the aberrated ray. 

In this paper, we derive exact analytical equations to compute 
wavefront points, aberrated ray directions and the transverse 
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ray aberrations in terms of the OPD function along the aber¬ 
rated ray Whereas the classical equations, Eqs. (1), are only an 
approximation, the new equations are applicable to large nu¬ 
merical aperture settings and for arbitrary differentiable OPD 
functions. We prove the exactness of the equations by validating 
the defining properties of the wavefront, i.e. the distance to the 
reference sphere and the orthogonality with the aberrated rays. 
We show that the classical equations for the ray aberrations are 
a special case of our equations and detail the conditions for a 
good approximation. Finally, we evaluate the approximation 
error quantitatively. 

2. OVERVIEW 

The main tool for our derivation is a Huygens-like interpretation 
of the wavefront as an envelope of spheres with a varying radius 
that is given by the OPD function. Fig. 2a. 

This conception enables us to perform limit considerations 
that are most suitably studied in the tangent space of the ref¬ 
erence sphere. The derivations are initially performed in this 
local space rather than in exit pupil coordinates. In the new 
coordinate system, we arrive at exact analytic equations for the 
wavefront and the aberrated ray directions, which, by means of 
a suitable transformation can be related to the original exit pupil 
coordinates. In this scheme, the aberrated ray directions can be 
computed without differentiating the wavefront. 

As a result, we obtain a set of rays with origins at the wave- 
front that can be propagated to the image plane to compute the 
exact analytic expressions for the transverse ray aberrations. 

We perform the derivation in several steps. First, Sect. 4, we 
perform the basic geometric construction of a wavefront tangent 
in one dimension in the canonical setting afforded by the tangent 
space construction outlined above. 

We leave the detailed definition of the required coordinate 
transformation for the discussion of the two-dimensional case. 
Sect. 5. We introduce the transformation between global and 
local coordinate frames, putting special emphasis on the trans¬ 
formation of functions defined in global exit pupil coordinates 
to the local tangent frame systems. 

We then generalize the one-dimensional geometric argument 
to two dimensions. Sect. 6, and derive local expressions for the 
wavefront and the aberrated ray directions. We continue by 
linking these expressions back to exit pupil coordinates, both in 
their arguments and in their values. We arrive at the key results, 
Fqs. (35) and (37). 

Finally, we make a connection to the transverse ray aberra¬ 
tions and the classical approximation, Fqs. (1), elucidating the 
conditions for a valid approximation. Sect. 8 demonstrates exem¬ 
plary applications and quantitative properties of the equations. 

The Appendix contains a proof of the wavefront properties of 
the derived quantities and establishes that every OPD function 
corresponds to a wavefront. 

3. THE WAVE ABERRATION FUNCTION (OPD) 

A. OPD Definitions 

In the classic literature, there are two different recurrent defini¬ 
tions for the wave aberration function (OPD). 

The standard definition [1, 3, 5] considers the path length 
between points on the aberrated wavefront and on the reference 
sphere, connected along the direction of the propagation of the 
aberrated ray. The OPD value is reported for the coordinate of 



Fig. 1 . (Color online) Alternative definitions of the wave aberra¬ 
tions. The exit pupil is located along the X axis, while the image 
plane is located at a distance zg along the Z axis. The wavefront 
is the curve (p{x). The reference sphere R{x) is centered at the 
paraxial image point s and has a radius of r. The aberrated ray 
passing through the point u intersects the image plane at the 
aberrated image point i. The transverse ray aberration is the 
difference i — s. 


the reference sphere point in the exit pupil plane. Its relation to 
the ray aberration is stated in Eqs. (1). 

The alternative definition [3,9] considers the wave aberration 
to be measured along a radius of the reference sphere, again be¬ 
tween two points on the wavefront and on the reference sphere, 
but with the OPD value assigned to the exit pupil location of the 
wavefront point. 

Both definitions are illustrated in Fig. 1 for a point u on 
the wavefront. For simplicity, we refer to the first definition as 
Wray since the path lengths are measured along the ray, while 
the second one is denoted as Wradius- The pupil coordinate Xt 
represents the pupil coordinate position for Wray, while Xu is 
the same for the alternative wave aberration definition. The 
wave aberration value is given a sign that depends on the delay 
relation between the wavefront and the reference sphere. For 
the particular case of point u in Fig. 1, the wavefront is delayed 
which implies a negative sign of the OPD. 

For Wradius f^e resulting relation between ray and wave aber¬ 
ration is different from Eqs. (1) and is reported [9] as: 


radius _ _ 

^ ~ ^radius 

radius _ _^ 2 ) 

^ ~ ^radius 

The advantage of the Wradius definition is a simple computa¬ 
tion of the wavefront surface via triangle relationships. However, 
Eqs. (2) are differential equations as compared to the classical 
Eqs. (1) and therefore, even though they are exact [9], difficult to 
solve. 

On the other hand, for the standard definition, Wray, there is 
no obvious construction of the wavefront surface from the OPD 
values, since the aberrated ray directions are unknown. 

Most authors prefer the Wray definition since it provides a 
direct connection with the pupil function and the calculation of 
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the point spread function (PSF) [2], but often the distinction is 
not clearly made. 

In this article, we derive exact equations for the wavefront 
and the aberrated ray directions for the standard OPD definition 
ySfray and use them to make an exact link with the ray aberra¬ 
tions. 

B. Properties of the OPD Function 

Since the derivations critically depend on the exact properties 
of the OPD definition, we discuss the interpretation underlying 
our derivations in detail. 

We use the standard definition of the OPD function VSfray as 
given above, simply denoting it as W in the following. In con¬ 
trast to most of the literature, we interpret the domain of the 
OPD function to be the reference sphere. The usual parameter¬ 
ization in terms of exit pupil coordinates is, in this sense, one 
parameterization of the function's domain. Other parameteriza- 
tions are possible and we will use this insight to define the OPD 
function in local coordinate systems. Sect. 5. These differently 
parameterized OPD functions all describe the same quantity, i.e. 
the phase delay along an aberrated ray intersecting the reference 
sphere at the position of intersection, just with a different frame 
of reference. 

A direct consequence of the above considerations and the 
fact that the OPD is single-valued is that no wavefront point 
can be at a closer distance to the reference sphere than the OPD 
value. This implies that there is an open ball around any point 
on the reference sphere that does not contain wavefront points. 
The radius of this open ball is equal to the OPD value. A sphere 
with this radius, i.e. the closure of the open ball, is tangent to 
the wavefront. The wavefront can therefore be considered as 
the envelope of a set of spheres with a varying radius that is 
described by the OPD function, a statement of Huygens prin¬ 
ciple. The concept is illustrated in Fig. 2 a), where R{x) is the 
reference sphere, W{x) the OPD function, and (p{x) the resulting 
wavefront. 

A misconception that is often found in the literature is an 
ambiguity between the slope of the wavefront and the slope of 
the OPD function. We emphasize that these two concepts must 
be distinguished. 

A key property of an OPD function is that its gradient mag¬ 
nitude cannot exceed one. To appreciate this point, consider 
a ID setting. Fig. 2 b) and c). As mentioned above, the OPD 
value W(x) implies an open ball without wavefront points sur¬ 
rounding a particular point x on the reference sphere. If the 
norm of the OPD derivative is greater than one \dW/dx\ > 1, 
the radius of this ball changes more quickly than the evalua¬ 
tion position, i.e. \dW\ > \dx\. It follows that one of the balls 
completely contains the other - which is a contradiction since 
both balls, by definition, do not contain wavefront points, but 
are simultaneously tangent to it. It follows that the norm of the 
OPD derivative cannot exceed one. An intuitive interpretation 
of this property is that the OPD would be required to be multi¬ 
valued in this situation. An alternative interpretation is that the 
wavefront can only be represented as an envelope of balls if the 
condition on the OPD norm is satisfied. 

The illustration in Fig. 2 b) and c) shows the two cases for 
a finite displacement Ax. The constraint that one ball does not 
contain the other yields the triangle inequalities illustrated in 
the Figure. Passing to the limit as Ax ^ 0 yields the condition 
\dW/dx\ < 1. 

With these prerequisites, we introduce the following 



a) Huygens-like wavefront b) W{x) <W(x + Ax) + Ax c) W{x + Ax) < W{x) + Ax 


Fig. 2. (Color online) a) Huygens-like wavefront construction 
illustrated - the wavefront ^(x) is the envelope of spheres with 
radii defined by the OPD function W(x). b) and c) Illustrating 
the condition that the gradient norm of the OPD function is 
smaller or equal to one. Applying a Taylor expansion to the 
term W(x + Ax) and simplifying the triangle inequality yields 
the condition. The colors of the mathematical terms correspond 
to the colored segments indicating the distances that constitute 
the triangle. 



Fig. 3. (Color online) Geometry considered. The axes XZ repre¬ 
sent the original coordinate system or global coordinates and the 
axes XZ represent the local coordinates for the point p(xp). We 
utilize circumflex symbols to distinguish quantities of the local 
system from those of the global one. 

Definition. OPD function: An OPD function W : ^ R is a 

twice-differentiable function with a gradient norm smaller or 
equal to one. Its domain is the reference sphere. 

The existence of the second derivatives is a technical require¬ 
ment for subsequent developments. As implied by the definition, 
the gradient is to be taken on the reference sphere. 

In contrast to the above discussion, the wavefront derivative 
can have arbitrary values. It follows that there are wavefronts 
that cannot be represented by an OPD function. However, we 
show in the Appendix that all OPD functions, satisfying the 
above conditions, describe wavefronts by deriving explicit con¬ 
struction rules. The specification of an OPD function is therefore 
a sufficient condition for a wavefront to exist. 

4. WAVEFRONT POINTS AND ABERRATED RAY DIREC¬ 
TIONS IN A 1D CANONICAL SETTING 

We develop the major geometric reasoning of our derivation in 
a canonical one-dimensional setting. Without loss of generality, 
we consider an orthonormal local coordinate system with its 
origin on the reference sphere and spanning its tangent space. 
To complete the basis, we use the tangent plane normal, oriented 
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towards the paraxial image point s. Such a local coordinate 
system is depicted in Fig. 3 for the current one-dimensional 
setting, for the point p. In local coordinates p = (0,0)^, i.e. it 
coincides with the local coordinate origin. 

As detailed in Sect. 5 for the full 2D case, these local coordi¬ 
nate systems can be obtained by a rigid body transformation of 
the exit pupil coordinate system. 

All quantities relating to the local system are denoted with 
circumflex symbols. In particular, p is the local coordinate origin, 
X and Z are the local coordinate axes and W(f) is the OPD func¬ 
tion parameterized in local coordinates x. The specifics of this 
parameterization are also covered in Sect. 5 for the 2D-setting. 
In addition, we denote the local representation of the reference 
sphere as R{x) := r — (r^ — and the local aberrated ray 

directions as h(f). Given these quantities, we may express a 
wavefront point as 

Hxq) = q{Xq) + W{x^) ■n{x^), (3) 

where q{xq) = (xq, A{xq))'^ is a point on the reference sphere 
corresponding to the local tangent space coordinate Xq, see Fig. 3. 
The local wavefront function ^ : R ^ yields the local 2D 
coordinates of the wavefront point. Its horizontal coordinate 
may be different from the evaluation position Xq. 

A. Constructing a Tangent to the Wavefront 

Considering the fundamental definition of the wavefront as the 
surface which is normal to the aberrated rays, we can consider 
a circle with radius = W(0) that is centered in the point 
p = (0,0)^, i.e. in the origin of the local coordinate system. This 
circle must be tangent to the wavefront since the wavefront is, 
by definition, located at a distance W(0) from point p. We wish 
to determine the point of intersection between the circle and the 
wavefront. 

For this, we introduce a neighboring second circle with its 
center at point q = {xq,R{xq))^, also centered on the refer¬ 
ence sphere and also tangent to the wavefront with radius 
r 2 = W{xq). Both circles are shown in Fig. 4. 

We now consider the circle at q to be approaching the circle 
at the center of the local coordinates p. In the limit, as the hori¬ 
zontal distance Ax = Xq between the centers tends to zero, the 
two circles coincide and the tangent to both becomes the tangent 
of the wavefront. Expressing this intuition mathematically leads 
to equations for the ray direction and the wavefront itself. 

Considering in more detail the geometry of Fig. 4, the tangent 
to both circles intersects the line connecting their centers at the 
point c, where distances between the points are related by the 
simple triangle relation 


(4) 


ri ^ r2 

|pc| Iqcl' 

Using |qc| = |pc| + |pq| we arrive at the following equation 


|pc| = 


ulpql 

ri-Ti 


(5) 


The point c is then given by: 


c = p 


(q-p)u 

ri-Ti 


( 6 ) 


We now introduce the explicit expressions for the vector quan¬ 
tities 



Fig. 4. (Color online) Approaching circles. As point q moves 
towards p along the reference sphere, the circles get closer. The 
sub-images (a) and (b) represent a decrease in Ax. 


P 

q 


n 


ri 


(Ax,X(Ax))^ = (^Ax,r 
W(0), 

dW 

W(Ax) = W(0) + ^ 



Ax + 0(Ax^), 

0 


(7) 


and insert the definitions into Eq. (6). We use a Taylor ex¬ 
pansion for W at the point q for studying the limit as point q 
approaches point p. Since the OPD function W is differentiable, 
the relation is exact in this limit. We thus obtain an explicit 
equation for the vector 


W(0) 


Ax 

r — (r^ — Ax^)^/^ 


^|oAx + 0(Ax2) 

which, as Ax tends to zero becomes 


( 8 ) 


lim c = 

Ax^O 



(9) 


where the limit has been determined using the rule of 
I'Hdpital. The previous equation is illustrated in Fig. 5. The 
limit behavior of the point c is finite if ^ |o 7 ^ 0 and it is located 

on the X axis. For the case that ^ |o = 0/ the point c is located 
at infinity. However the calculation of the wavefront point is 
now trivial since it is located on the axis Z at a distance W (0) 
from the origin of the local coordinates. 

For the limiting case where c is finite, we determine the tan¬ 
gent point with the wavefront utilizing the triangle between 
the tangent point, the origin of the local coordinate system, and 
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Fig. 5. (Color online) Limit when circles coincide. The point c is 
now located on the axis. 


this vector component represents exactly the normalized ray 
direction: 



We interpret the result as showing that, in the canonical situ¬ 
ation, the OPD derivative equals the (negative) direction cosine 
of the ray. 

Eqs. (14) and (15) allow us to obtain the exact wavefront point 
and aberrated ray direction knowing only the local function for 
the reference sphere R{x) and the local OPD function W(f). 


the limiting point c. This triangle is illustrated in Fig. 5. Using 
the geometry, the tangential point, with coordinates (x,z)^ is 
calculated as 

X = C X cos((x), 

z = (c^-jE2)l/2^ (JQ) 

Trigonometry yields 



Note that there may be two solutions since two tangent lines 
to the circles exist. However, since the sign of the OPD function 
indicates the direction of the wavefront with respect to the refer¬ 
ence sphere, we can unambiguously select the correct solution. 


5. LOCAL COORDINATE SYSTEMS AND FUNCTIONS 
THEREIN 

So far, we have been describing the canonical situation in one 
dimension. For a generalization of the result, we need to 1) detail 
the construction of the local coordinate system as well as the 
transfer of the relevant functions, and 2) expand the results of 
the previous section towards two dimensions. 

As mentioned in Sect. 4, we require an orthonormal tangent 
frame to the reference sphere and a transformation from 
this local system to the global exit pupil coordinate system. The 
underlying reason for requiring orthonormality is a preservation 
of distance measures in relation to the global coordinate system. 

The tangent frame of the reference sphere and its associated 
transformation are parameterized by an evaluation point p = 
on it. The reference sphere in global exit 
pupil coordinates is given by 


B. Wavefront Point and Aberrated Ray 

Replacing the auxiliary variables with their definitions in terms 
of the OPD function. 


c = W(0), 


W(0) 



( 12 ) 

we arrive at explicit expressions for the local coordinates of 
the wavefront point: 


X 


z 




W(0) 1 



We denote it in vector notation as ^(0) = (x,z)^: 


(13) 


^(0) = W(0) 


1 


dW I 
dx lo 




(14) 


Here we notice the significance of the condition that the norm 
of the OPD derivative be smaller or equal to one. Sect 3. 

Eq. (14) allows to simultaneously obtain the normalized ray 
direction. As seen from the equation, the vectorial part is a) a 
unit vector, and b) multiplied by the OPD value W (0). Therefore, 


H^V'Vv) = Zs - - (Xp - Xsf - {yp - (16) 

The point s = (xg, yg, Zg) is the paraxial image point, also in 
global coordinates, and r = (xg^ + yg^ + Zg^)^/^is the radius 
of the reference sphere. The exit pupil is centered in the global 
origin, and the positive Z-axis is pointing towards the image 
plane, which is located at a distance Zg. 

A. Local Orthonormal Tangent-Frame to the Reference 
Sphere 

We construct an orthonormal local tangent frame coordinate 
system at point p by resorting to spherical coordinates around 
the paraxial image point s. Suitably normalized derivatives with 
respect to the spherical coordinates then provide us with the 
local linear approximation of the reference sphere, i.e. with its 
tangent plane. We choose to position the poles of the spherical 
coordinate system in the image plane, in particular along the 
Y-axis, in order to avoid singularities of the reference sphere 
parameterization in the space between the image plane and the 
exit pupil. 

In particular, we use the following assignment of Euler angles 




Hxp,yp) = 


p -^s 

-1 (Vv ~ 


, and 


Pi^P'Vp) = const. = r, 
the inverse equations of which are given by 


( 17 ) 






















Research Article 


Journal of the Optical Society of America A 6 


X = |Ocos(0) sin((^) + Xs, 

y = pcos{(p) +ys, and 

z = |Osin(0) sin((/>) + Zg. (18) 


The variables p, 0 and (p all depend on the evaluation position 
{xp,yp) in global coordinates, Eq. (17). The unit vectors of the 
tangent frame are then obtained via 


/ ^ I \ 

d9 I (^p/i/p) 


K[xp,yp) — 


psincp 


^ I 

V J 
\ 


( — sin(0) ^ 


^{xp,yp) — - 


\ cos(0) y 

^ cos(0) COs((/)) ^ 
— sin((^) 
sin(0) cos((^) j 


z[xp,yp) — 


^ a^l(xp,yp) 

I i^p/Vp) 

\ 3^l(^p,yp) / 

^ %\{x„y,) \ ( - cos(0) sin(^) ^ 

dp I i^p/Vp) 

\ J 


— cos{(p) 
y — sin(0) sin((/)) 


(19) 


Here, the vectors have been normalized and Uz has been 
inverted to point towards the paraxial image point s. Since the 
unit vectors depend on {xp,yp), the transformation from the 
local coordinate systems to the global one is parameterized by 
the point of evaluation p. The associated matrix is given by 



and its spatial derivatives are denoted as 


( 20 ) 


3Tp-i 

dxp 



( 21 ) 


and respectively We use homogeneous coordinates in 
order to describe rigid body transformations, including their 
translational part, as matrix-vector products. 

B. Transforming Exit Pupil Functions into Local Coordinates 

Key to the transfer of functions into the local coordinate systems 
is the realization that they are defined with respect to a common 
surface that is known in both systems. In particular, the reference 
sphere is given in global coordinates by Eq. 16, whereas in any 
local system it is 


R{x,y)=r-{r^-x^ (22) 

due to the symmetry of the sphere. Eurther, as mentioned 
in Sect. 3, we define the domain of the OPD function W as the 
surface of the reference sphere. The exit pupil coordinate repre¬ 
sentation that is commonly used is then a parameterization of 
this function on the sphere. Let us denote this parameterization 
as W(x,y). 



Fig. 6. (Color online) Transfer of function values between co¬ 
ordinate systems: The center point for the transformation is p 
and the point q is an arbitrary point on the reference sphere. A 
function / defined on the reference sphere whose value is given 
in exit pupil coordinates by /(x^) must return the same value 
for the local coordinates f{xq), i.e. f{xq) = f{xq). 


The OPD function value for a point q = {xq,yq, R{xq,yq))^ 
on the reference sphere is therefore obtained by evaluating 
W{Xq,yci)- 

In a local coordinate system, the point q = has a dif- 
ferent set of coordinates q. However, the function values of the 
OPD function W{xq,yq) = W(xq,yq) in both systems must be 
the same since q and q are only different coordinates of the same 
point, see Fig. 6. We therefore define 

W(x,y) := W(x(x,y),y(x,y)), (23) 

where the functions 

^ x{x,y;xp,yp) ^ 
y{x,f,xp,yp) 
z{x,y;xp,yp) 

\ 1 / 

perform a remapping of the two different parameterizations 
of the reference sphere. For clarity, we have explicitly denoted 
the dependence on the evaluation point p. For the transfer of 
functions between the two systems, only the functions x(x,y) 
and y(x,y) are significant. 

The construction above serves as a general tool to transfer 
all quantities of interest into a local coordinate system, i.e. in 
addition to the OPD function W(x, y), the wavefront cp{x,y) and 
the normalized ray direction of the aberrated ray n(x,y) are 
defined locally via 

^{x,y) := q){x{x,y),y{x,y)), and 

n(f,y) := r\{x{x,y),y{x,y)), (25) 

by linking them to their global definition in exit pupil coordi¬ 
nates. 


Tr 


-1 


X 

y 

R{x,y) 

1 


(24) 
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Fig. 7. (Color online) Geometry considered. XYZ are the global 
coordinates. XYZ are the local coordinate axes at point p. The 
wavefront W{x,y) and the reference sphere X{x,y) are plotted 
along with the tangent space for the point p. 



Fig. 8. (Color online) Approaching spheres. The wavefront sur¬ 
face and the reference sphere are represented as single profiles to 
simplify the graphics. The two approaching spheres are centered 
on the reference sphere and their tangent cones have vertices 
and cy. 


6. EXACT 2D CALCULATION OF THE WAVEFRONT 
FROM THE WAVE ABERRATION 

A. Constructing a Tangent Plane to the Wavefront 

The derivation for the 2D case utilizes the ID result of Sect. 4 and 
proceeds along similar lines. To restate the problem: knowing 
the reference sphere R{x,y) and the OPD function W{x,y) we 
want to calculate the corresponding wavefront q){x,y) : 

R^, which is now a function that takes a 2D coordinate to a point 
in three-dimensional space. According to the wavefront and 
OPD definitions. 


<p{xp,yp) = p{xp,yp) + W{xp,yp)n{xp,yp), ( 26 ) 


and 


I 

lim = 

V 



/ 

lim cv = 

\ 


0 

-W(0,0)/f 1(00) 

0 


\ 


/ 


( 27 ) 


( 28 ) 


where n{xp,yp) is again the unknown normalized direction of 
the aberrated ray and p(xp,yp) = {xp,yp,R{xp,yp))^ represents 
the evaluation point on the reference sphere. The derivation is 
similar to the ID case, namely the combination of a coordinate 
transformation and the calculation of a limit. The geometry of 
the setting as well as the global and local coordinate systems 
involved are illustrated in Fig. 7. 

We continue the derivation in local coordinates. We now 
study two spheres with their centers located on the surface of the 
reference sphere. Both spheres are tangent to the wavefront. One 
of them is fixed at the origin of the local coordinate system while 
the other one approaches it. Ultimately, the two spheres coincide 
at the limit. Since the second sphere can be approaching from 
an arbitrary direction, we instead consider two pairs of spheres 
moving along the local coordinate axes X and Y, respectively. In 
doing so, we obtain equations where the X and Y components 
are independent. To each pair of spheres, there is a tangent cone 
with its vertex located in the XZ and YZ planes respectively. 

The two pairs of spheres are illustrated in Fig. 8. In this Fig¬ 
ure, the wavefront surface and the reference sphere previously 
plotted in Fig. 7, are now replaced by single profiles. In the side 
views we see the two pairs of spheres. The objects tangent to 
each are now cones with vertices cy and cy. Employing the same 
analysis as in Fig. 5, we find the equations for the coordinates of 
the vertices. 

Consequently, when the spheres are coinciding at the limit, 
we find the coordinates for the vertices of the cones to be 


where Ax and Ay represent the distances between centers, 
along the corresponding axis, for each pair of approaching 
spheres. The coinciding spheres are shown in Fig. 9. The limit 
points cy and cy are located on the axes X and Y respectively. 
The tangential circles are located in planes parallel to the XZ 
and YZ planes. 

In the case that either limit approaches infinity, the solution 
reduces to the ID case. If both limits are infinite, the wavefront 
point is at a distance W (0,0) along the Z axis. 

Similar to the ID case where we obtained the wavefront point 
W (0) using Eqs. (10), we compute the wavefront point W (0,0) 
as the point of intersection of the limiting tangent circles, as 
shown in Fig. 9. The X and Y coordinates are obtained using 
triangle relations as in Fig. 5. The Z component is determined by 
observing that both circles in Fig. 9 are part of the limit sphere. 
The wavefront in the local coordinate system is therefore given 
by 




^(0,0) = W(0,0) 




8W I 

dx \ ( 0 , 0 ) 

aw I 

ay I (0.0) 



\ 

/ 

( 29 ) 


The normalized ray direction in local coordinates is the vec¬ 
torial part of the wavefront point and is given by 
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Fig. 9. (Color online) Coinciding spheres. The vertices are now 
located along the corresponding axis. The tangent circles inter¬ 
sect at two points, for two solutions of the wavefront ^(0,0). We 
select the solution whose sign fits the OPD. 


h(0,0) = 




aw 

"SF 

aw 

■ ^ 


( 0 , 0 ) 

( 0 - 0 ) 


(^ 1 ( 0 , 0 )) -(^ 1 ( 0 , 0 )) ) 


(30) 


B. Explicit Expressions in Global Coordinates 

In the previous sections we have derived equations for the wave- 
front point and the aberrated ray direction in local coordinates. 
We now relate them back to the global coordinate system. 

Since both the parameters and the values of the functions in 
Eqs. (29) and (30) are in local coordinates, two steps are neces¬ 
sary: 

1. a replacement of local wavefront and wavefront derivative 
terms by their corresponding expressions in global coordi¬ 
nates, followed by 

2. a back-transformation (using Eq. (24)) of the still local result 
into global coordinates. 

Before carrying out the two steps, we note that the local 
OPD function W(0,0) evaluated at the local origin is equal to 
the global OPD function W{xp,yp) evaluated at the exit pupil 
coordinates of point p by Eq. (24). 

Second, the 2D expression for the local wavefront derivatives 
in global coordinates is 


1 ( 0 , 0 ) 

dW I 

af 1(0,0) 


dw 

ar 

aw 


= vw|(0,0)=rvw|(,^,,^), 


(31) 


where 


with 


f-i 

^21 


T = 


n-l 

^12 


j-1 

^11 

rp-1 

il2 


J-1 

^21 

J-l 

±22 


(32) 


122 ^ being the elements of the upper 


left 2 X 2 sub-matrix from Eq. (20). The relation can be found 
by differentiating the definition of the function W, Eq. (23); the 
Jacobian of the coordinate transformation, Eq. (24), is found to 
be T and Eq. (31) follows by the chain rule. Note that T is the 


transpose of the upper left 2x2 sub-matrix of T~^ and therefore 
also dependent on p. 

With these prerequisites we can now perform the transforma¬ 
tion in two steps. 


Wavefront Point: Performing step 1 yields 


^(0,0) 


^ -W (0,0)VW|(o,o) ^ 

w(o,o) • y^i-vw^-vw|(o_o) 

1 / 

^ -W(xp,yp)rVW|(,^,y^) ^ 

w(xp,yp)^i - vw^T^ • . 

1 / 

(33) 


Defining 

nf = il- ( 34 ) 


and performing step 2 yields 


tp{xp,yp) 


-W-TVW \ 


Tp-V(O-O) = Tp-i 


W-Hf 




(35) 


which is the 3D wavefront point in global (exit pupil) coordi¬ 
nates as a function of the exit pupil coordinates. Note that the 
X- and Y-components of cp{xp,yp) will typically not be {xp,yp) 
unless W{xp,yp) = 0, Eq. (24). 

Ray Direction: The aberrated ray direction is similarly trans¬ 
formed into global coordinates. Step 1 results in 


h(0,0) 




Uf 

0 / 


Applying step 2, we obtain 


(36) 


T^{xp,yp) = T 1 






Uf 

0 


(37) 


Since only three components of the homogeneous direction 
vector n are non-zero, only the upper left 3x3 matrix of T~^ 
is effective. Eor general transformations M, normals need to be 
transformed via M~^ [10]. In our case, the upper left 3x3 
matrix of T~^ is orthonormal and therefore its own inverse 
transpose. 

Eqs. (35) and (37) are the key results of this paper. An an¬ 
alytical proof that the ray directions n are orthogonal to the 
wavefront cp at every point and that the wavefront is located at 
the OPD distance of W is given in the Appendix. 
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7. CONNECTION TO THE RAY ABERRATIONS 

Once the wavefront point and the aberrated ray direction are 
known, they can be used to compute the transverse ray aber¬ 
rations. We make a connection to the classical approximation, 
Eqs. 1, that relates wave and ray aberrations and derive the exact 
conditions for the approximation to be valid. 

The transverse ray aberrations e are obtained by computing 
the aberrated image point i and subtracting the paraxial image 
point s = [xsfVsf^sY from it, i.e. 


e = i — s. (38) 

The aberrated image point i is obtained by computing the ray 
intersection of the aberrated ray with the image plane situated 
at Zg. The aberrated ray passes, by definition, through the wave- 
front point (p{xp,yp), Eq. (35), and has the direction n{xp,yp), 
Eq. (37). We compute the path length t to the image plane by 


, 3 „ 

[n]z [n]z ^ 


where {[(p]x/[(p]y,[(p]z)^ are the components of cp{xp,yp), 
([n]j, [n]y, [n]^)^ the components of r\[xp,yp), and Eq. (26) has 
been used. The aberrated image point can now be written as 


and Ax = {xp — Xg) and Ay = {yp —ys)- Eqs. (42) are the 
exact equations of the transverse ray aberrations in terms of the 
OPD function. We now show that the classic approximation can 
be obtained as a special case. 

Taking the limit as Ax ^ 0, Ay ^ 0 results in 


lim €x 

Ax—^0,Ay—^0 


lim ey 
Ax^0,Ai/^0 ^ 



from which we obtain the classical form, Eqs. (1), in the case 
of 11 VW| I C 1. The conditions for the validity of the classical 
approximation are therefore: 


(i) the evaluation position [xp,yp) in the exit pupil approaches 
the paraxial image coordinate {xs,ys), and 

(ii) the gradient norm of the OPD 11V W| | ^ 1. 


p + W(xp,i/p)n(xp,yp) + fn(xp,yp) 

(zs — R(xp,yp)) 


P + 
/ , 


-n{xp,yp). 


\ 


n 

X 

n 

z 

n 

y 

w 

k 


(40) 


We see that the point i is indeed in the image plane. Using 
Eq. (38) and ignoring the zero z-component, the transverse ray 
aberrations become 


= {xp - Xs) + {zs - R{xp,yp))^^ 

£y = {yp-ys) + {zs-Hxp,yp))^^^ (41) 

which, with the help of a computer algebra package, can be 
simplified to a closed form in terms of the OPD derivative and 
the paraxial image position s: 


with 


^2 aw 

^ aF 

A-B 

yi aw 

A-B 


(42) 


B 

A 


aw, aw, 

Ax + ^:^Ay, 

ax ay 




(43) 


It is interesting to note that the best approximation is not 
obtained in the origin of the exit pupil, but in the exit pupil 
position closest to the paraxial image. 

8. EXAMPLES OF WAVEFRONT AND RAY ABERRATION 
CALCULATIONS 

In the following, we provide three representative examples of 
computations with our analytic expressions. We provide a com¬ 
parison with results obtained from the classical approximation 
in Eqs. (1). The comparisons are performed for both the trans¬ 
verse ray aberrations and the reconstructed wavefront. The 
error between our computation and the classical approximation 
is studied for all three examples. 

The results are summarized in Eig. 10. Each column is ded¬ 
icated to a different test case and each row contains the same 
type of plot for each example. Eirst, we briefly describe the in¬ 
formation found in each row and later discuss the results in the 
context of each of the optical systems. 

The first row shows a scaled system diagram, together with 
the lens prescription and the imaging conditions, reporting all 
quantities in millimeters. We obtain the first order system prop¬ 
erties and a Zernike decomposition of the OPDs from OSLO, a 
commonly known Optics design software. The number of poly¬ 
nomials for the fit are set to the maximum available, yielding 37 
Zernike coefficients. We use the Zernike expansions as ground 
truth OPD functions, shown in the second row. Additionally, we 
provide the peak-to-valley (P-V) and RMS values to emphasize 
the degree of the aberrations. 

In the third row we plot the transverse ray aberrations for 
both directions, which can be considered as a re-centered spot 
diagram in the paraxial image plane. We simultaneously plot the 
exact ray aberrations ey vs. Cj and the classical ray aberrations, 
now re-named to €y vs. Cj. 

The fourth row presents a comparison between the exact ray 
aberrations and the classical approximation, by plotting the error 
in the norms of the transverse ray aberrations ||e:|| — ||e||, with 
I |e| I = (cj^ + ey^)^^^ being the norm of the exact transverse 
ray aberrations, Eqs. (42), and | |e| | = norm for 
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Fig. 10. (Color online) Multiple computations for a singlet, a Cooke triplet and a post-surgery corneal aberration map. 
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the classic equations, Eqs. (1). This surface error is scaled in 
wavelengths and plotted on a logarithmic Z axis to enhance its 
d 3 mamic range. 

Finally, the last row presents a comparison of the exactly com¬ 
puted wavefront, Eq. (35), with an approximation obtained from 
the classical equations. For computing the latter, we propose 
the following procedure: we connect the reference sphere eval¬ 
uation point p with the approximation of the aberrated image 
point i, obtained by adding the classical transverse ray aberra¬ 
tions (%/ to the paraxial image point s. The direction of this 
3D line is an approximation to the true aberrated ray direction 
n, Eq. (37). We use it in conjunction with Eq. (26) to construct 
an approximated wavefront tp. The difference in 3D positions 
\\(p{x,y) — y^{x,y)\ \ is shown in the last row, again scaled in 
wavelengths and with a logarithmic Z axis. We include this 
additional comparison, since the wavefront shape is important 
for an exact computation of wave-optical point spread functions 
using Huygens' principle. We now discuss the individual test 
cases. 

(i) Singlet: This system is one of the simplest optical set-ups 
with correspondingly large aberration values, especially at lower 
f-numbers. We chose a working //# = 4, focal length / 60mm 
at a wavelength A = 587.56nm. In the OPD plot, a large astig¬ 
matic component can be identified, along with a large P-V value. 
The spot diagram indicates that the classical approximation has 
a large error for the most external intersection points. The ray 
aberration error plot further emphasizes this fact, presenting a 
maximum in the range of 10^A. The corresponding wavefront 
error has a maximum value on the order of 8A. 

(ii) Cooke Triplet: We continue, by studying a more complex 
and well known system: the Cooke Triplet. The lens prescrip¬ 
tion is given in the plot. The imaging conditions closely resem¬ 
ble the singlet example, with a working //# = 4, focal length 
/ 50mm at a wavelength A = 587.56nm. The OPD plot in¬ 
dicates a better corrected system, with again a large astigmatic 
component. The spot diagram indicates that the intersection 
points from the classical approximation closely resemble the 
exact calculations. The errors of the classical approximation in 
both, the ray aberration and the wavefront, are now significantly 
smaller, with maximum values in the range of 7A and 2 x lO-^A, 
respectively. 

(ill) Post-Surgery Cornea: We select this example as a test case 
presenting aberrations of higher order, in contrast to the previous 
examples where the OPD functions were slowly varying. The 
Zernike coefficients are extracted from a post-LASIK-surgery 
corneal topography [11]. In the first row of Fig. 10, we present 
the list of Zernike coefficients, measured in microns, were the 
values are listed using the Noll index ordering [12]. To main¬ 
tain the context of an eye as the optical system, we select a 
working //# = 2.6, focal length / = 17mm at a wavelength 
A = 587.56nm. The OPD plot shows a more complex structure 
giving rise to an unusual spot diagram. The maximum error 
of the classical approximation in the ray aberrations is now in 
the order of 70A and the corresponding error for the wavefront 
comparison has a maximum value on the order of 3 x lO-^A. 

Comparing the errors between the cornea and the Cooke 
triplet examples, we observe the dependence of the approxi¬ 
mation error of the classical Eqs. (1) on the derivatives of the 
wave aberration function. As stated before, Eq. (42), the classical 
approximation becomes more accurate for smaller gradients of 
the OPD function. In the post-surgery cornea example, even 


though the magnitude of the aberrations is smaller than in the 
Cooke triplet case, we observe approximation errors that are an 
order of magnitude larger. 

9. CONCLUSIONS 

We have derived exact analytic expressions for the wavefront 
surface, the aberrated ray directions, and the transverse ray aber¬ 
rations for the standard definition of the optical path difference 
as a phase delay along the aberrated ray. 

A transition to the local tangent frames of the reference sphere 
enables a Huygens-like geometric construction of the wavefront 
as an envelope of spheres and yields additional constraints on 
the OPD function. We show in the Appendix, that every OPD 
function satisfying the constraints has an associated wavefront 
and that the constructed wavefronts and rays fulfill the wave- 
front properties exactly. 

The exact aberrated rays yield exact equations for the trans¬ 
verse ray aberrations. We have identified the precise conditions 
for the classical approximation to hold. The conditions differ 
from commonly held assumptions. The relevant factors are 
small OPD gradients, as opposed to the OPD magnitude, in con¬ 
junction with evaluation positions close to the paraxial image 
coordinates, as opposed to the pupil center. 

We presented numerical simulations to illustrate the errors 
arising from the classical approximation for typical scenarios. 
The simulations provide a quantitative background for the theo¬ 
retical results. 
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APPENDIX: VALIDATING THE WAVEFRONT PROPER¬ 
TIES 

We now validate the wavefront properties of the wavefront 
derived in Eq. (35) and the ray direction from Eq. (37). Two 
conditions must be met for these quantities to be compatible: 

(i) The wavefront point (p{x-p,y-p) obtained via Eq. (35) must 
have a distance of W(x^, y-p) from the point p, and 

(ii) The aberrated ray direction r\{xp,yp), Eq. (37), must be 
orthogonal to the wavefront in the point (p{xp,yp). 

The two conditions constitute the definition of a wavefront. 
In showing that they are met by the quantities in Eqs. (35) 
and (37), we prove their correctness. In addition, the proof 
also shows that every OPD function, according to the definition 
of Sect. 3, has a corresponding wavefront. 

(I) Wavefront Distance: First, from Eqs. (35) and (37), we verify 
that Eq. (26) holds. 

We need to show that ||n(xp,yp)|| = 1. For, in this case 
\\(pixp,yp) — pII = Using Eq. (36) in local coordi¬ 

nates, the result is readily obtained: 


||n(A^p,i/p)||2 


||n(0,0)||2= (-rVW)^ + M^ 

VW^T^ • TVW + nj 

x/w'^T'^ ■ T\/w + 1 - vw^r^ • rvw 

1 . ( 45 ) 
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Since the upper left 3x3 sub-matrix of transformation T~^ is 
orthonormal, it does not change the length of the normal vector 
when changing to global coordinates. 

□ 

(ii) Orthogonality of Wavefront and Aberrated Ray Direction: We 

need to show 


nixp,ypf 


d<p{xp,yp) 

dXp 

d(p{xp,yp) 

dyp 


0, and 
0. 


(46) 


The latter equality is due to 

dfljr 
dXp 

which can be verified by multiplying out the expressions. 

Term {II): is tractable with computer algebra software. Defin- 
ing 


jTn-T Y7TAT I vjiATT'-rT'-r I 


vw r' ^ vw + vw r r 

dXv 


d^W 


d^pdXp 


(50) 


We proof the equality for the x-tangent vector the deriva¬ 
tion for the y-tangent vector being strictly similar. It is important 
to perform the proof in global coordinates since the local coordi¬ 
nate system changes when changing the evaluation position p. 

Denoting the homogeneous local coordinate origin as O = 
(0,0,0,1)^, the dot-product between the tangent vector and the 
aberrated ray direction is given by 


we find that 


= 3 -^ 

oXp 


'Tx = 


^(0,0), (51) 

(52) 

□ 


3W 

dXp' 


(/) yields || (//) yields 

= 0. (47) 

Simplification of the complete Eq. (47) with a computer alge¬ 
bra package is, unfortunately, not tractable. As indicated above, 
we decompose the equation into term (I) and term (II). Term (I) 
will be shown to equal (^ for the y-tangent). Term (II) can 

be shown to equal ^or the y-tangent) with the help 

of a computer algebra software. 

Term (I): contributes the majority of cross-terms when multi¬ 
plying out Eq. (47). It can be conveniently treated in a vectorial 
fashion: 


/V T ^ /TAT/v\ /V T / 3IV 0n 


dXv 


dXv 


= h^h- -hWh^^. 

oXp oXp 

aw 

dXp ' 


(48) 


The last equality is due to the fact that h^h = 1, as shown in 
Eq. (45), and 


ah 


h^^ = vw^r^^vw + 


t-tt . 


ax 


ax. 


Summarizing, we have shown that the wavefront point and 
the aberrated ray directions derived in Sect. 6 are compatible 
with the wavefront and the OPD properties. Since the proof 
has been constructive, we have shown that every OPD function 
according to the definition of Sect. 3 has indeed an associated 
wavefront. 
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